(57) ABSTRACT

A method for replicating data in a distributed computer
environment wherein a plurality of servers are configured
about one or more central hubs in a hub and spoke arrangement.
In each of a plurality of originating nodes, updates and
associated origination sequence numbers are sent to the
central hub. The hub sends updates and associated distribution
sequence numbers to the plurality of originating nodes.
The hub tracks acknowledgments sent by nodes for a
destination sequence number acknowledged by all nodes.
Thereafter, a highest origination sequence number is sent
from the central hub back to each originating node.

22 Claims, 4 Drawing Sheets 
METHOD, SYSTEM AND COMPUTER PROGRAM FOR REPLICATING DATA IN A DISTRIBUTED COMPUTED ENVIRONMENT

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates to replication techniques that allow workgroups to connect locally and at the same time keep information synchronized across geographically dispersed sites.

2. Description of the Related Art

Enterprise messaging requirements are evolving beyond traditional store-and-forward e-mail to include the integration of groupware/workflow applications and tight coupling with the "browsing" model of corporate intranets. Another key trend, made possible by the proliferation of Internet standards, is ubiquitous access to information across standards-based networks and data stores. At the same time, the messaging infrastructure must be extended beyond the enterprise to business partners, customers and suppliers, to provide a significant return on investment in electronic messaging technologies.

As a result of these new imperatives, enterprise and inter-enterprise message traffic is expanding quickly beyond the limitations of disparate legacy systems, loosely coupled and separate mail and intranet systems, and the multitude of gateways connecting them. Indeed, companies are now faced with the task of consolidating heterogeneous e-mail systems, delivering access to new sources of information, and building a robust messaging infrastructure that meets current and expected enterprise requirements.

A known enterprise messaging solution is Lotus® Notes®, which is a platform-independent, distributed client-server architecture. Domino™ servers and Lotus Notes® clients together provide a reliable and flexible electronic mail system that "pushes" mail to recipients using industry standard message routing protocols, and that facilitates a "pull" paradigm in which users have the option to embed a link to an object in a message. The object can reside in a Domino database, an HTTP-based "intranet" data store, a page on the World Wide Web, or even a Windows® OLE link. Lotus Notes also tightly integrates groupware applications [...]

Groupware connects users across time and geography, leveraging the power of network [...] networks present one of the biggest [...] implementation. Connections are sometimes unavailable or inadequate for demanding tasks. While this can be due to failure, there are many other reasons including, without limitation, mobile users, remote offices, and high transmission costs. Groupware has to keep users working together through all these scenarios. The technology that makes this possible is so-called replication. A replication mechanism puts information wherever it is needed and synchronizes changes between replicas.

Thus, for example, using Lotus Domino™ replication services, an organization wishing to deploy a Web application to multiple locations may set up servers in each location. As data is changed in each location, the architecture ensures that databases are synchronized through replication. As another example, a salesperson who pays frequent visits to customer sites also needs to stay connected to the databases and information at her home office. When she leaves the office with a laptop computer, she makes a copy or replica of the lead tracking and customer service databases that she needs. While out of the office, however, other account managers may make changes to the server database at the same time that she is making her own changes. The salesperson can re-synchronize the two by replicating again over a telephone connection. All updates, additions and deletions that were made to the server after she left the office are now replicated to the laptop database, and all updates, additions and deletions she made on the laptop database are replicated back to the server database. The replication process also detects update conflicts and flags them for the salesperson and other users to reconcile.

There are several known replication techniques that allow workgroup users to connect to a local server and at the same time keep information synchronized across geographically dispersed sites. Documents in the replicated database are composed of fields. When two servers desire to synchronize their respective version of a given document, the most recent field entry for each field of the document is often used for replication purposes. If timely replication is desired, updates to one replica are propagated to other replicas as soon as possible. For example, in Lotus Notes Clustering Release 4.5, replication is effected by having every server convey an update to every other server whenever a local update occurs. This approach suffers from the drawback of not being readily scaleable. Another approach is 'scheduled replication', wherein a pair of servers periodically wake up and compare data sets. This requires every data set on both servers to be compared and is strictly a two way operation. Scheduled replication is costly and cannot be done in a timely fashion, and it also creates a significant amount of undesirable network traffic.

Other known techniques (e.g., Microsoft Exchange) provide a simple first generation messaging-based replication scheme. This technique relies on store-and-forward mail to push changes from one server to other defined replicas on other servers. There is no comparison operation, however, to guarantee that replicas remain synchronized. Such a system significantly increases administrative and end-user burden. Moreover, if a user changes even a single property or field of a document, the entire document must be replicated rather than just the property or field. Netscape Suitespot uses proxy [...] Web pages, which reduces network bandwidth requirements. This technique, however, is merely duplication (copying files from a distant place to a closer place), and there is no relationship between the [...] replication mechanism.

There remains a need to provide enhanced replication schemes that address the deficiencies in the prior art.

BRIEF SUMMARY OF THE INVENTION

It is a primary object of this invention to replicate data in a timely manner across a large number of nodes.

It is another primary object of this invention to provide replication enhancements in a distributed system that significantly reduce network traffic.

It is still another primary object of this invention to provide high performance, realtime replication in a geographically-dispersed network topology.

Still another primary object is to provide a simple replication mechanism that is highly scaleable.

A particular object of this invention is to configure a replication mechanism within a hub and spoke network architecture.

Still another particular object is to enable sliding window acknowledgment through the hub on a broadcast to nodes in the network architecture.
Yet another object of the present invention is to enable spokes to issue periodic acknowledgments to the central hub, and for the central hub to issue periodic acknowledgments to the originating spokes, wherein such acknowledgments effectively indicate the vitality of the nodes within the system as well as any need for packet retransmission.

Still another object of this invention is to provide a hub recovery mechanism to enable a set of server nodes to transfer operation from a failed hub to a new central hub. Also, the invention provides a spoke failure mechanism for isolating a failed hub from a group of destination nodes that are targeted to receive update(s), and for selectively re-admitting the failed hub back into the destination group.

A further object of this invention is to implement a multilevel replication mechanism, for example, wherein first level hubs participate as second level spokes in a recursible architecture extendible to any depth desired.

Still another object of this invention is to enable a plurality of updates to be batched, collected and distributed together by the replication mechanism.

Another more general object is to synchronize multiple database replicas in a distributed computer environment. These databases, for example, may be local replicas of databases on a large number of servers, registry information servers, domain name servers, LDAP directory servers, public key security servers, or the like.

These and other objects are provided in a method for replicating data in a distributed system comprising a plurality of originating nodes associated with a central hub. Each of the plurality of originating nodes sends updates and associated origination sequence numbers to the central hub. A given update is directed to a distribution group comprising a set or subset of the originating nodes and typically comprises the changes in a given data set supported in a database on each such originating node. According to the method, the central hub receives, packages and sends the updates with associated distribution sequence numbers to the plurality of originating nodes. In the central hub, acknowledgments sent by originating nodes are then tracked. Each acknowledgment preferably identifies a last in-sequence distribution sequence number processed by a respective originating node. The central hub then periodically sends a message to each originating node. The message includes information identifying a highest origination sequence number acknowledged by originating nodes (comprising the given distribution group) and the highest origination sequence number associated with an update received at the central hub from the originating node.

Thus, in the inventive scheme, the originating node applies its origination sequence number to a given update and the central hub applies the hub distribution sequence number to its broadcast of that update. The periodic acknowledgment by the central hub triggers retransmission of dropped packets from the nodes and the periodic acknowledgment by the nodes triggers retransmission of dropped packets from the hub. The periodic acknowledgments also serve as a "heartbeat" to indicate the vitality of the nodes within the system.

According to the replication scheme, updates and associated distribution sequence numbers are only sent (by the central hub) to originating nodes having no more than a permitted quantity of unacknowledged updates. The central hub also rebroadcasts updates and associated distribution sequence numbers to originating nodes whose acknowledgments indicate lack of receipt of updates from the central hub. Rebroadcasting begins with an update associated with a lowest unacknowledged distribution sequence number. Upon failure of a given originating node, the node is isolated from other nodes in the given distribution group. If the failed node later resurfaces, a node failure recovery scheme is implemented. In particular, the failed originating node is associated with another originating node (a "buddy" node) in the given distribution group. The failed node is then provided with a current copy of a data set from the buddy node. The failed node rejoins the group and becomes eligible for updates at the start of the copy from the buddy node, although actual transmission is likely to be delayed until after the node tells the hub its copy has completed.

The replication mechanism also implements a hub recovery mechanism. In particular, upon failure of the central hub, a given subset or quorum of the originating nodes confer to designate a substitute central hub. Thereafter, hub responsibilities are transferred to the substitute central hub.

According to a preferred embodiment, a given originating node sends a plurality of updates to the central hub in a package. Likewise, the central hub sends a plurality of updates to a given originating node in a package. Thus, the mechanism processes updates in a batched manner.

The foregoing has outlined some of the more pertinent objects and features of the present invention. These objects and features should be construed to be merely illustrative of some of the more prominent features and applications of the invention. Many other beneficial results can be attained by applying the disclosed invention in a different manner or modifying the invention as will be described. Accordingly, other objects and a fuller understanding of the invention may be had by referring to the following Detailed Description of the preferred embodiment.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present invention and the advantages thereof, reference should be made to the following Detailed Description taken in connection with the accompanying drawings in which:

FIG. 1 is a simplified block diagram of a hub and spoke architecture in which the inventive replication mechanism is implemented;

FIG. 2A illustrates an originating node transmitting an update to the central hub;

FIG. 2B illustrates the central hub broadcasting the update to a distribution group;

FIG. 2C illustrates node acknowledgment of the update;

FIG. 2D illustrates the central hub returning an acknowledgment back to the originating node;

FIG. 3 is a block diagram illustrating a hub failure recovery mechanism of the present invention;

FIG. 4 is a block diagram illustrating a node failure recovery mechanism of the present invention;

FIG. 5A is a flowchart illustrating a routine for transferring updates from originating nodes to the central hub;

FIG. 5B is a flowchart illustrating a routine for broadcasting updates from the central hub to destination nodes;

FIG. 5C is a flowchart illustrating a node acknowledgment routine by which individual nodes acknowledge to the central hub receipt of updates;

FIG. 5D is a flowchart illustrating a hub acknowledgment routine by which the hub acknowledges updates to the nodes;

FIG. 6 is a block diagram illustrating how the inventive architecture implements a recursive, multilevel replication functionality.
DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT 

Preferably, a set of nodes dispersed geographically are
organized into a "hub and spoke" topology as illustrated in
FIG. 1. In this arrangement, a central hub 10 has a plurality
of originating nodes 12a-n associated therewith. Each originating
node preferably is a server running a messaging/groupware
package such as Lotus Domino, and each server
has associated therewith a plurality of clients/users (not 
shown). Typically, the central hub is a lightweight applica- 
tion running on a computer, and the computer is preferably 
located on the same physical wire used by its local origi- 
nating nodes. Each originating node located at a "spoke" in 
the topology supports at least one database 14, a transmit 
queue 16 and a receive queue 18. Central hub has a 
destination queue or journal 19. 
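
The arrangement of FIG. 1 can be pictured with a few plain data structures. The following is a minimal sketch in Python and is not taken from the disclosure; the class and field names (`SpokeNode`, `CentralHub`, `transmit_queue`, and so on) are assumptions chosen only to mirror reference numerals 10 through 19.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SpokeNode:
    """Hypothetical model of an originating node 12 (names are illustrative)."""
    node_id: str
    database: dict = field(default_factory=dict)          # database 14
    transmit_queue: deque = field(default_factory=deque)  # transmit queue 16
    receive_queue: deque = field(default_factory=deque)   # receive queue 18


@dataclass
class CentralHub:
    """Hypothetical model of central hub 10; it keeps a journal, not a database."""
    hub_id: str
    journal: deque = field(default_factory=deque)         # queue or journal 19
    spokes: dict = field(default_factory=dict)            # node_id -> SpokeNode

    def attach(self, node: SpokeNode) -> None:
        """Register a spoke with this hub."""
        self.spokes[node.node_id] = node


hub = CentralHub("hub-10")
for name in ("12a", "12b", "12c"):
    hub.attach(SpokeNode(name))
```

Each spoke owns its own queues, so an update placed on one node's transmit queue is invisible to the others until it has passed through the hub.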

Each node in the hub and spoke topology thus preferably 
contains a replica of the database 14. The hub may include 
the actual database but typically does not, and the hub need 
not participate in the scheme as a spoke (except as described 
below with respect to FIG. 6). The database includes a 
plurality of documents or data set(s) that are periodically 
updated by one or more users associated with a given node. 
An update 20 generated at a given node is typically a change 
(i.e. the delta) in a given document or data set supported 
across a subset of the other nodes in the network. Usually, 
the changes occur in given field values of the data set. A 
given node (i.e. one of the spokes in the hub and spoke 
topology) need not always be a server at which changes to 
documents originate. Thus, for example, a given machine at 
such a node (whether server or client) may be simply a slave 
for backup purposes. 

The present invention provides a mechanism for replicat- 
ing an update on other nodes that have the same data set. 
Thus, for example, changes to documents at the field level 
are captured and replicated across the server nodes. The 
central hub and each of the servers preferably include a map 
of the servers that have the same document or data set. A 
given update thus may be targeted to a given "distribution 
group" of servers supporting the same data set. The distri- 
bution group may include all of the originating nodes or 
some subset or quorum thereof. It may also include standby 
backup nodes that are passive (i.e. nodes that originate 
nothing but are still in the distribution group) to permit, for 
example, implementations in which replicas exist on clients 
as well as servers. 
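
The map of which servers hold which data set can be sketched as a simple lookup table. The example below is illustrative only; the data set names, node identifiers, and the `PASSIVE_NODES` set are hypothetical and merely show how a passive backup node can belong to a distribution group without originating updates.

```python
# Hypothetical distribution map: data set name -> nodes holding a replica.
DISTRIBUTION_MAP = {
    "leads": {"12a", "12b", "12c"},
    "service": {"12b", "12d"},
}

# Passive (backup-only) nodes receive updates but never originate them.
PASSIVE_NODES = {"12d"}


def distribution_group(data_set: str) -> set:
    """Return a copy of the set of nodes that must receive updates for a data set."""
    return set(DISTRIBUTION_MAP.get(data_set, set()))


def may_originate(node_id: str, data_set: str) -> bool:
    """A node may originate an update only if it is an active group member."""
    return node_id in distribution_group(data_set) and node_id not in PASSIVE_NODES
```

Under these assumptions, node 12d still appears in the "service" group and so receives every update to that data set, while `may_originate("12d", "service")` is false.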

As will be seen, a given node may (but need not) 
"package" one or more updates to the central hub in a 
compressed format (e.g., within a zip file). When a plurality 
of updates are packaged together, this is sometimes referred 
to herein as "batching". The central hub, as will be seen, may 
likewise batch updates for multiple data sets (in a package) 
so that distribution from the central hub may be effected for 
several data sets at one time. On occasion, a particular 
central hub distribution package may include an update 
directed to a given destination node as well as another 
update that the given destination may not care about. In such 
case, the latter update may simply be ignored or discarded. 
Communications between an originating node and the hub 
are typically encrypted and authenticated in a known manner 
(e.g., via a secure sockets layer). 
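
Batching several updates into one compressed package, and discarding updates a destination does not care about, might look like the sketch below. It uses Python's standard `zipfile` module as a stand-in for the zip format mentioned above; the JSON encoding, the entry names, and the `data_set` key are assumptions, and encryption and authentication are deliberately left out.

```python
import io
import json
import zipfile


def pack_updates(updates: list) -> bytes:
    """Batch several updates into one in-memory zip package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for index, update in enumerate(updates):
            archive.writestr(f"update-{index}.json", json.dumps(update))
    return buffer.getvalue()


def unpack_updates(package: bytes, local_data_sets: set) -> list:
    """Unpack a package, discarding updates for data sets this node does not hold."""
    kept = []
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        for name in archive.namelist():
            update = json.loads(archive.read(name))
            if update["data_set"] in local_data_sets:
                kept.append(update)
    return kept
```

A destination that supports only one of the batched data sets simply passes that set's name to `unpack_updates` and never sees the rest.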

According to the invention, each package (which may 
include one or more updates) transmitted from a given 
originating node (to the central hub) has an originating 
sequence number associated therewith. This number, 
preferably, is ever increasing. Thus, as illustrated in FIG. 1, 
the package 13 being transmitted from node 12a has a given
originating sequence number (e.g., 100) while package 15
being transmitted from node 12b has a different originating
sequence number (e.g., 210). Whenever a new update
received at a particular originating node is packaged for
transmission, that node's sequence number is increased and
the resulting sequence number is associated with the package.
In like manner, central hub 10 has a distribution
sequence number associated therewith which, preferably, is
ever increasing. Whenever a package (including one or more
given updates from one or more originating nodes) is
prepared for broadcast by the central hub, the hub's distribution
sequence number is increased and the resulting
sequence number is associated with the package.

Thus, in the preferred embodiment, a particular origination
sequence number is associated with each package
created at a given origination node. Likewise, a hub or
distribution sequence number is associated with each package
created at the central hub. Of course, if a given package
includes a single update, the origination sequence number
(in either case) may be considered associated with just that
update.

Using the example originating sequence numbers identi- 
fied above, at a given processing cycle (or point in time) 
server 12a has the update package 13 with origination
sequence number 100, server 12b has update package 15 
with origination sequence number 210 as previously 
described. Central hub, by way of example, associates 
package 13 (from node 12a) with a hub package 17 having 
destination sequence number 534, while update 15 (from 
node 12b) is associated with a hub package 21 having 
destination sequence number 535. 
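
The two levels of numbering in this example can be reproduced with independent counters, one per originating node and one for the hub. The sketch below is illustrative; the `SequenceSource` class and the starting values are assumptions picked so that the numbers match those quoted above, and `distribution_seq` stands for what the text variously calls the hub, distribution, or destination sequence number.

```python
import itertools


class SequenceSource:
    """Hands out ever-increasing sequence numbers, one per package."""

    def __init__(self, last_used: int):
        self._counter = itertools.count(last_used + 1)

    def next(self) -> int:
        return next(self._counter)


# Starting values are chosen only so the results match the example numbers.
node_12a = SequenceSource(last_used=99)
node_12b = SequenceSource(last_used=209)
hub_10 = SequenceSource(last_used=533)

# Each originating node stamps its own package.
package_13 = {"origin": "12a", "origination_seq": node_12a.next()}
package_15 = {"origin": "12b", "origination_seq": node_12b.next()}

# The hub wraps each received package and stamps its own sequence number.
hub_package_17 = {"distribution_seq": hub_10.next(), **package_13}
hub_package_21 = {"distribution_seq": hub_10.next(), **package_15}
```

Because the hub package carries both numbers, a later acknowledgment of a distribution sequence number can be translated back into the origination sequence number of the node that sent the update.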

FIGS. 2A-2D illustrate the preferred inventive replication 
method of the present invention. Referring first to FIG. 2A,
assume that the server at node 12b desires to send an update 
to a given data set located on a distribution group. In this 
example, it is assumed that the distribution group includes 
all of the servers illustrated. As noted above, typically the 
central hub and each of the servers include a map of which 
servers support each given data set. As will be described in 
more detail below, the server 12b polls its transmit queue (to 
locate the update), and then compresses and sends the update 
(preferably in a zip package) to the central hub 10 as 
illustrated. Compression is not required. Each update is 
preferably queued at its originating node until an acknowledgment
is later received from the hub that all nodes in the
distribution group have received it. As noted above, updates 
may be batched together within a given package. Upon 
receipt of the package, the central hub places the received 
update in its queue or journal for subsequent distribution. 
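
The spoke-side behaviour of FIG. 2A (poll the transmit queue, send a package, and hold it until the hub confirms delivery to the whole group) can be sketched as follows. This is a simplified illustration, not the disclosed routine; the `OriginatingNode` class, its method names, and the `send` callback are hypothetical, and compression is omitted.

```python
from collections import OrderedDict


class OriginatingNode:
    """Hypothetical spoke-side bookkeeping for unacknowledged packages."""

    def __init__(self, node_id: str, first_seq: int = 1):
        self.node_id = node_id
        self.next_seq = first_seq
        self.transmit_queue = []      # updates waiting to be packaged
        self.pending = OrderedDict()  # origination seq -> package awaiting hub ack

    def poll_and_send(self, send) -> None:
        """Package everything queued, stamp it, send it, and keep a copy."""
        if not self.transmit_queue:
            return
        package = {"node": self.node_id, "seq": self.next_seq,
                   "updates": list(self.transmit_queue)}
        self.transmit_queue.clear()
        self.pending[self.next_seq] = package
        self.next_seq += 1
        send(package)

    def on_hub_ack(self, highest_acked_seq: int) -> None:
        """Drop every package the hub reports as received by the whole group."""
        for seq in [s for s in self.pending if s <= highest_acked_seq]:
            del self.pending[seq]
```

Anything still in `pending` after a hub acknowledgment arrives is a candidate for retransmission, which is why the copy is kept rather than discarded at send time.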

FIG. 2B illustrates the distribution of this update from the 
central hub 10 to each of the nodes of the distribution group. 
In particular, and as will be described below, the central hub 
periodically polls its queue, packages each pending update
(either singularly or jointly with other updates) and then 
broadcasts the package to the destination nodes comprising 
the distribution group. As illustrated in FIG. 2B by the dotted 
line, the originating node (in this case server 12b) may or
may not receive the pending update that originated from that 
node. 
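
The hub-side distribution of FIG. 2B can be sketched in the same spirit. The class below is an assumption-laden illustration: `HubBroadcaster`, the `deliver` callback, and the `echo_to_origin` flag are hypothetical names, one update per package is assumed, and a plain loop stands in for whatever broadcast transport is actually used.

```python
class HubBroadcaster:
    """Hypothetical hub-side loop: drain the journal, stamp, and fan out."""

    def __init__(self, groups: dict, first_seq: int = 1, echo_to_origin: bool = False):
        self.groups = groups                  # data set -> set of node ids
        self.next_seq = first_seq
        self.echo_to_origin = echo_to_origin  # whether the originator gets its own update
        self.journal = []                     # received packages awaiting distribution

    def receive(self, package: dict) -> None:
        """Place a package received from a spoke into the journal."""
        self.journal.append(package)

    def poll_and_broadcast(self, deliver) -> list:
        """Broadcast each journaled package; return the hub sequence numbers used."""
        used = []
        while self.journal:
            package = self.journal.pop(0)
            outbound = {"hub_seq": self.next_seq, **package}
            targets = set(self.groups.get(package["data_set"], ()))
            if not self.echo_to_origin:
                targets.discard(package["node"])
            for node_id in sorted(targets):
                deliver(node_id, outbound)
            used.append(self.next_seq)
            self.next_seq += 1
        return used
```

The `echo_to_origin` switch reflects the dotted line in FIG. 2B: whether the originating node also receives its own update is left as a configuration choice in this sketch.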

Turning now to FIG. 2C, each destination node then 
"acknowledges" back to the central hub. This is preferably 
accomplished by having each node periodically wake up and 
determine the last in-sequence update received. This information
is then acknowledged back to the central hub, as
indicated by the arrows, and continues periodically for each 
node. Thus, a node acknowledges to the central hub the successful receipt of an update by acknowledging that update or any subsequent update from the hub.

In FIG. 2D, the central hub then acknowledges the originating node, server 12b, when it determines the last in-sequence updates have been acknowledged by all spokes in the distribution group. The central hub continues to acknowledge to each node periodically. Thus, the central hub acknowledges the successful broadcast of an update to all nodes (of the distribution group) by acknowledging that update or any subsequent update from the same originating node. This is the basic replication scheme unless there is a given hub or spoke failure, each of which will now be briefly described.

A given hub failure is typically indicated by lack of periodic acknowledgments 23 (as was illustrated in FIG. 2D above) from the central hub to the spoke nodes. As illustrated in FIG. 3, when this condition occurs, the nodes confer on designation of a new hub via connection 25. Upon designation of the new hub 10', updates not yet acknowledged from all spoke nodes in the distribution group are then retransmitted to the new hub as illustrated by reference numeral 27. New hub 10' may or may not have been a spoke node, as the actual hub need not be a physical device. The hub process may coexist with the server processes on any node or reside on some other system. In this hub recovery operation, each node retransmits to the new hub every update for which the failed hub has not indicated successful receipt by every node in the distribution group. Thus, given nodes may receive the same update twice (once from the failed hub, and later from the new hub).

The node failure mode is illustrated in FIG. 4. In this mode, a given node 12d is determined to have failed by repeatedly failing to acknowledge 29 the central hub (which is the normal operation illustrated in FIG. 2D) or by sending acknowledgments which indicate it is incapable of keeping up with the remainder of the distribution group. In such case, the hub 10 "fences" or isolates the failed node 12d from the other nodes of the distribution group, ceasing further distribution to that node and notifying other nodes of this change in the distribution group. There are a number of ways to propagate the change in distribution map with the simplest being that the hub merely notifies all of the nodes of the change. Another approach would be to notify the other nodes by piggybacking such notice on a next message, or to provide such notice on a demand basis. Preferably, however, at least one node should be immediately notified of the change in status in the event the hub should later fail, and all nodes must be notified of the state change (of a given node) at the time any hub recovery occurs as described below. Thus, according to the invention, upon a spoke node failure, the hub notifies the other nodes of this situation.

If the fenced-off node later resurfaces and seeks to re-enter the distribution group, preferably the hub 10 first designates a "buddy" node (e.g., node 12e) for the "failed" node. The buddy node will be used to provide a current copy of the data set to the failed node and thus bring the failed node back into the distribution group. Before this transfer begins, however, the last-acknowledged sequence number is first taken from the buddy node. If necessary, the hub then admits the buddy node to the distribution group, although updates are preferably held (or deferred) pending completion of the copy process. The failed node then requests (numeral 31) a then-current copy of the data set, which is then returned (numeral 33) by the buddy node 12e. The failed node rejoins the distribution group and becomes eligible for updates at the start of the copy from the buddy node, although typically actual transmission (of updates) is likely to be delayed until after the failed node (now readmitted to the group) informs the hub that its copy has completed. The buddy node is not frozen to block it from accepting changes from clients or the hub during the copy process. This node, however, is effectively frozen on each document but is free to take other work between documents.

According to the present invention, the originating node applies its origination sequence number to a given update (or package), and the central hub applies the hub distribution sequence number to its broadcasts. The periodic acknowledgment by the central hub triggers retransmission of dropped packets from the spoke nodes, and the periodic acknowledgment by the nodes triggers retransmission of dropped packets from the hub. Thus, the periodic acknowledgments serve as a "heartbeat" to indicate the vitality of a given node within the system.

FIGS. 5A-5D illustrate detailed flowcharts showing these routines. As will be seen, given steps of these flowcharts are carried out at the nodes and/or at the central hub. They are preferably implemented as computer software instructions stored in memory and executed by a processor. Thus, one preferred implementation of the present invention is as a computer program product in a computer-readable medium. A "spoke" portion of this software is executed on each node, and a "central hub" portion is executed on the central hub (or on a given hub that takes over in the hub failure mode as illustrated in FIG. 3).

FIG. 5A is a flowchart describing the preferred routine for transferring updates from each node to the central hub. As noted above, this routine preferably is implemented on each server or other machine operating as an originating node. At step 52, the node periodically polls its transmit queue looking for updates. As noted above, updates intended for other nodes are stored in the node's transmit queue. At step 54, if an update is present in the transmit queue, the routine compresses it (for example, using InfoZip™ freeware or WinZip™ shareware) and packages the update for transmission to the central hub. The routine then continues at step 56 by affixing a node identifier (ID) and an originating sequence number to the package. As previously described, each package (or each update) includes an associated originating sequence number. At step 58, the node transmits the package to the central hub. As noted above, a given package may include multiple updates in a "batch". For simplicity of the following discussion, a given package includes just one update. The routine continues at the central hub with step 60. At this point, the hub polls its queue or journal for any received updates. This step, for example, may be carried out every few seconds. At step 62, assuming one or more updates have been received, the hub packages them for transmission. At step 64, and in connection therewith, the hub affixes to each package a hub identifier (ID), a hub sequence number, the originating node identifier, the originating node sequence number (associated with the package in which the update was received), and a destination group identifier identifying which nodes actually contain the data set being updated. At step 66, the central hub control routine places the package and the accompanying information into its journal for transmission.

FIG. 5B is the routine by which updates are broadcast or otherwise delivered from the central hub to the spoke nodes. The preferred technique for distributing updates is interrupt-driven Internet Protocol (IP) multicasting, although one of ordinary skill will appreciate that any suitable broadcast technique may be used. The routine begins at step 68 with
the control routine advancing to a next element of the queue
or journal of packages to be broadcast. At step 70, the
routine updates, for each spoke node, the highest
unacknowledged hub sequence number. At step 72, the routine
broadcasts the update to nodes with no more than a
(preferably selectable) maximum permitted quantity (count
and/or size) of unacknowledged update packages. Thus, if
a particular node is far behind in acknowledging or far
behind in receiving (as indicated by acknowledging prior
update broadcasts), it will not be provided any new updates.
Rather, the node is thus fenced-off or isolated from the rest
of the distribution group according to the node failure
scenario illustrated above in FIG. 4.

At step 74, the control routine rebroadcasts where necessary
to overdue destination nodes, preferably beginning at a
package with the lowest unacknowledged hub sequence
number. Thus, for example, if a given node has acknowledged
sequence number 535, but sequence numbers 537 and
538 have already been sent, this step would rebroadcast
beginning with sequence number 536. At step 76, the routine
declares a given destination node undeliverable when a
selectable maximum permitted retry threshold is reached.
Thus, step 76 occurs when the node does not send a timely
acknowledgment or when its acknowledgments indicate that
it is unable to keep pace with the distribution group. Again,
this scenario causes the spoke to be isolated from the
destination group as previously described. At step 78, the
routine checks to see which destination group nodes are in
the process of creating or recovering the database. At step
79, the central hub marks those nodes and makes no attempt
to send further updates until the importation is complete.
The destination nodes preferably inherit the last sequence
number acknowledged to the hub from the destination node
from which they are importing the database at the time
importation begins.

FIG. 5C is a flowchart describing the node acknowledgment
routine. It begins at step 80 by evaluating the highest
distribution sequence number received in sequence. Step 82
evaluates the highest distribution sequence number
acknowledged back to the central hub. At step 84, the routine
periodically generates an acknowledgment to the central hub
when the numbers (obtained at steps 80 and 82) differ or
when no updates have been received for a given, selectable
time period. At step 86, the central hub evaluates the lowest
distribution sequence number of the package for which all
destinations have acknowledged receipt. The routine then
continues at step 88 with the hub updating the need for
acknowledgments to nodes whenever the lowest sequence
number of the package for which all destinations have
acknowledged receipt is updated. Preferably, this step is
accomplished using a bit mask over the originating nodes.

FIG. 5D is a flowchart illustrating a hub acknowledgment
routine. For each originating node, the central hub maintains
a highest distribution sequence number received and a
highest distribution sequence number acknowledged to its
origin. At step 90, the highest distribution sequence number
received is evaluated. The highest distribution sequence
number acknowledged to origin (i.e. the originating node) is
evaluated at step 92. At step 94, the central hub periodically
sends acknowledgments to the originating nodes. Such
acknowledgments are sent in the form of messages.

Generally, a message is sent (from the central hub to a given
originating node) that includes information identifying a
highest origination sequence number acknowledged by all
nodes of the distribution group and the highest origination
sequence number associated with the package last transmitted
from that originating node and received at the hub. Thus,
at step 96, the hub sends acknowledgments to originating
nodes (a) for which all packages have been broadcast or
declared undeliverable to all recipients, or (b) for which no
packages have been received for a given, substantial period
of time. This completes the processing.

To provide a concrete example, assume that a first originating
node has sent a package having an origination
sequence number 100, while a second originating node has
sent a package having an origination sequence number 210.
The package associated with origination sequence number
100 (from the first originating node) has been given
distribution sequence number 534 and transmitted by the central
hub. The package associated with origination sequence
number 210 (from the second originating node) has been
given distribution sequence number 535 and transmitted by
the central hub. Once distribution sequence number 535 has
been acknowledged by each node of the given destination
group, the central hub (in its periodic message) tells the first
originating node that its update (corresponding to origination
sequence number 100) has been received by all nodes
comprising the relevant distribution group. This is necessarily
the case because if the package associated with the
distribution sequence number 535 was received, it follows
that the earlier package (i.e. the one associated with the
lower distribution sequence number 534) was also properly
received and processed. The central hub also sends a message
to the second originating spoke confirming that its
update (i.e. the one corresponding to the highest distribution
sequence number 535) was received.

Thus, when the central hub issues its acknowledgment
back to the originating nodes, it must first perform a
translation of the distribution sequence number to the respective
origination sequence number. Continuing with the above
example, once the originating nodes acknowledge distribution
sequence number 535, the central hub translates this number
back to the origination sequence number (in this case
210), which is delivered to the origin node. For the other
nodes, the central hub identifies the closest (but lower)
distribution sequence number (in this case 534) and then
translates that number to its corresponding origination
sequence number (in this case, 100).

In this manner, it can be seen that the central hub
acknowledges successful broadcast of an update to all nodes
by acknowledging that update or any subsequent update
from the same originating node. Likewise, a node acknowledges
successful receipt of an update to the hub by acknowledging
that update or any subsequent update from the hub.
The above-described scheme facilitates a so-called "sliding
window" acknowledgment through the central hub on a
broadcast to a set of originating nodes. In particular, when a
given originating node sends a given update to the central
hub, it does not have to wait for a specific acknowledgment
to know that the given update, in fact, was received by the
target nodes. Indeed, the given originating node will know
this to be the case if it receives a highest origination
sequence number from the central hub that corresponds to a
later-transmitted update.

FIG. 6 illustrates how the present invention may be
implemented in a multilevel, recursive architecture. In this
embodiment, first level hubs 10 and 10' each have associated
therewith a set of originating nodes 12a-12n as previously
described. First level hubs 10 and 10', however, participate
as second level spokes with respect to a second level hub
10". Thus, the second level hub 10" receives updates from
the first level spokes (really the first level hubs). In like
manner, this architecture is repeatable for any desired level
of recursion. This facilitates a robust and scalable replication
mechanism.
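
By way of illustration only, the translation from distribution
sequence numbers back to origination sequence numbers, as in the
concrete example above, might be sketched in Python as follows.
The class name HubJournal, its fields, and its method names are
hypothetical and are not part of the described embodiment; the
sketch assumes the hub keeps an in-memory record of which
originating node and origination sequence number each
distribution sequence number was assigned to.

```python
from dataclasses import dataclass, field


@dataclass
class HubJournal:
    """Hypothetical hub-side record of assigned distribution sequence numbers."""
    next_dist_seq: int = 1
    # distribution sequence number -> (originating node id, origination sequence number)
    entries: dict = field(default_factory=dict)

    def accept(self, origin: str, orig_seq: int) -> int:
        """Assign the next distribution sequence number to a received package."""
        dist_seq = self.next_dist_seq
        self.entries[dist_seq] = (origin, orig_seq)
        self.next_dist_seq += 1
        return dist_seq

    def acked_origination_seqs(self, acked_by_all: int) -> dict:
        """For each originating node, return its highest origination sequence
        number whose distribution sequence number is at or below the number
        acknowledged by every node of the group."""
        highest = {}
        for dist_seq, (origin, orig_seq) in self.entries.items():
            if dist_seq <= acked_by_all:
                highest[origin] = max(orig_seq, highest.get(origin, orig_seq))
        return highest


# The concrete example: packages 100 and 210 receive numbers 534 and 535.
journal = HubJournal(next_dist_seq=534)
journal.accept("first", 100)
journal.accept("second", 210)
acks = journal.acked_origination_seqs(acked_by_all=535)
# acks maps "first" to 100 and "second" to 210
```

If only distribution sequence number 534 had been acknowledged by
all nodes, the same call would report origination sequence number
100 for the first originating node and nothing yet for the second.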

A representative server/central hub is a computer having
an operating system and support for network connectivity. 
Thus, for example, a representative computer comprises a 
computer running Windows NT (Intel and DEC Alpha), 
IBM OS/2, IBM AIX, HP-UX, Sun Solaris (SPARC and 
Intel Edition), Novell NetWare or Windows '95. 

As noted above, one of the preferred embodiments of the 
routines of this invention is as a set of instructions (computer 
program code) in a code module resident in or downloadable 
to the random access memory of a computer. 

In addition, although the various methods described are 
conveniently implemented in a general purpose computer 
selectively activated or reconfigured by software, one of 
ordinary skill in the art would also recognize that such 
methods may be carried out in hardware, in firmware, or in 
more specialized apparatus constructed to perform the 
required method steps. 
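
As one purely illustrative software rendering, the in-sequence
acknowledgment of FIG. 5C (step 80) and the flow control and
rebroadcast decisions of steps 72 and 74 might be sketched in
Python as follows. The function names and parameters are
hypothetical, and the sketch assumes that distribution sequence
numbers are consecutive integers and that the permitted quantity
is expressed as a simple count of packages.

```python
from typing import Optional, Set


def highest_in_sequence(received: Set[int], last_acked: int) -> int:
    """Step 80: highest distribution sequence number received with no gap
    after the last number already acknowledged."""
    seq = last_acked
    while seq + 1 in received:
        seq += 1
    return seq


def may_send(node_acked: int, highest_sent: int, max_unacked: int) -> bool:
    """Step 72: send new updates only to a node holding no more than the
    permitted count of unacknowledged packages."""
    return highest_sent - node_acked <= max_unacked


def rebroadcast_start(node_acked: int, highest_sent: int) -> Optional[int]:
    """Step 74: lowest unacknowledged sequence number, or None if the node
    has acknowledged everything sent so far."""
    return node_acked + 1 if node_acked < highest_sent else None


# A node that received 534, 535, 537 and 538 can acknowledge only 535,
# so a rebroadcast to it would begin with 536.
acked = highest_in_sequence({534, 535, 537, 538}, last_acked=533)
start = rebroadcast_start(acked, highest_sent=538)
```

A node for which may_send returns False would simply be skipped
for new broadcasts until its acknowledgments catch up or it is
declared undeliverable.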

What is claimed is: 

1. A method for replicating data in a distributed system, 
comprising the steps of:

from each of a plurality of originating nodes, sending 
updates and associated origination sequence numbers 
to a central hub, wherein a given update is directed to 
a distribution group comprising a set of the originating 
nodes; 

from the central hub, sending updates and associated 
distribution sequence numbers to the plurality of origi- 
nating nodes; 

in the central hub, tracking acknowledgments sent by 
originating nodes, each acknowledgment identifying a 
last in-sequence distribution sequence number pro- 
cessed by a respective originating node; and 

in the central hub, periodically sending a message to each 
originating node, the message including information 
identifying a highest origination sequence number 
acknowledged by originating nodes comprising the 
given distribution group. 

2. The method as described in claim 1 wherein the 
message further includes the highest origination sequence 
number associated with an update received at the central hub 
from the originating node. 

3. The method as described in claim 1 wherein an update 
remains queued at its originating node at least until the 
message from the central hub indicates the update has been 
received by the distribution group. 

4. The method as described in claim 1 wherein updates 
and associated distribution sequence numbers are sent to
originating nodes with no more than a permitted quantity of 
unacknowledged updates. 

5. The method as described in claim 1 further including 
the step of: 

from the central hub, rebroadcasting updates and associ- 
ated distribution sequence numbers to originating 
nodes whose acknowledgments indicate lack of receipt 
of updates from the central hub. 

6. The method as described in claim 5 wherein the 
rebroadcasting step begins with an update associated with a 
lowest unacknowledged distribution sequence number. 

7. The method as described in claim 1 further including 
the step of: 

upon failure of a given originating node, isolating the 
given originating node from other nodes in the given 
distribution group. 

8. The method as described in claim 1 further including 
the steps of: 

subsequently associating the given originating node with 
another originating node of the given distribution 
group; and 
readmitting the given originating node to the given dis-
tributing group.

9. The method as described in claim 1 further including 
the step of: 

upon failure of the central hub, having a given set of the
originating nodes confer to designate a substitute cen-
tral hub; and

transferring hub responsibilities to the substitute central 
hub. 

10. The method as described in claim 9 further including
the step of having each originating node retransmit to the 
substitute central hub each update for which the central hub, 
upon failure of the central hub, had not indicated successful 
receipt by every node of the plurality of originating nodes. 

11. The method as described in claim 1 wherein a given 
originating node sends a plurality of updates to the central
hub in a package. 

12. The method as described in claim 11 wherein the 
central hub sends a plurality of updates to a given originating 
node in a package. 

13. A computer program product in a computer-readable 
medium for replicating data in a distributed system com- 
prising a plurality of originating nodes associated with a 
central hub, wherein origination nodes send updates and 
associated origination sequence numbers to the central hub, 
comprising: 

wherein a given update is directed to a distribution group 
comprising a set of the originating nodes, 

means operative in the central hub for sending updates 
and associated distribution sequence numbers to the 
plurality of originating nodes; 

means operative in the central hub for tracking acknowl- 
edgments sent by originating nodes, each acknowledg- 
ment identifying a last in-sequence distribution
sequence number processed by a respective originating
node; and 

means operative in the central hub for periodically send- 
ing a message to each originating node, the message 
including information identifying a highest origination 
sequence number acknowledged by originating nodes 
comprising the given distribution group. 

14. The computer program product as described in claim 
13 wherein the message further includes the highest origi-
nation sequence number received at the central hub from the
originating node.

15. The computer program product as described in claim 
13 further including means operative in the central hub for 
rebroadcasting updates and associated distribution sequence 
numbers to originating nodes whose acknowledgments indi-
cate lack of receipt of updates during a given time period.

16. The computer program product as described in claim 
13 further including means operative in the central hub for 
isolating a given originating node from a given distribution 
group. 

17. The computer program product as described in claim
13 further including means for packaging a plurality of 
updates in a batch. 

18. An apparatus for replicating data in a distributed data 
processing system, the apparatus comprising:

receiving means for receiving updates and associated
origination sequence numbers from a plurality of origi- 
nating nodes; 

first sending means for sending updates and associated 
distribution sequence numbers to the plurality of origi-
nating nodes;

tracking means for tracking acknowledgments sent by 
originating nodes, each acknowledgment identifying a 
last in-sequence distribution sequence number pro-
cessed by a respective originating node; and

second sending means for periodically sending a message
to originating nodes, the message comprising informa-
tion identifying a highest origination sequence number
acknowledged by originating nodes.

19. The apparatus as described in claim 18 wherein the 
message further comprises the highest origination sequence 
number received from the originating node. 

20. The apparatus as described in claim 18 further com-
prising:

means for rebroadcasting updates and associated distri- 
bution sequence numbers to originating nodes whose 
acknowledgments indicate lack of receipt of updates
during a given time period. 

21. The apparatus as described in claim 18 further compris-
ing:

means for isolating a given originating node from a given 
distribution group. 

22. The apparatus as described in claim 18 further com- 
prising: 

means for packaging a plurality of updates in a batch. 
* * * * *